How to Train Object Detection Net
YOLOv3 is a real-time multi-object detector that follows a single-shot detection architecture. The training steps below are compiled from the instructions at AlexeyAB-Darknet.
Labeling
We label each object in the dataset images with AlexeyAB-Yolo_mark, a GUI tool for marking bounding boxes of objects and generating annotation files. Each annotation line follows this convention:
<object-class> <x_center> <y_center> <width> <height>
<object-class> - integer object number from 0 to (classes-1)
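<x_center> <y_center> <width> <height> - float values relative to the image width and height, e.g. <x> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
<x_center> <y_center> - the center of the rectangle (not the top-left corner)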
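As a minimal sketch of the convention above, the following converts a box given in absolute pixel corner coordinates into one annotation line. The function name and the corner-based input are assumptions made for illustration; Yolo_mark writes these lines for you.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert an absolute pixel box (corner coordinates) into one annotation line."""
    # The center and size are divided by the image dimensions to become relative values.
    x_center = (x_min + x_max) / 2.0 / image_width
    y_center = (y_min + y_max) / 2.0 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A 200x200 pixel box inside a 400x500 image, class 0.
    print(to_yolo_line(0, 100, 50, 300, 250, 400, 500))
```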
Create the file train.txt in the directory build\darknet\x64\data\ with the filenames of your images (relative image paths), one per line. This file is generated automatically if you use AlexeyAB-Yolo_mark to annotate your data.
Download the pre-trained weights for the convolutional layers (darknet53.conv.74, 154 MB) and put them in the directory build\darknet\x64.
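If you do not use Yolo_mark, the image list can be generated with a few lines of Python. This is a sketch under assumptions: the function name is hypothetical, and the data/obj prefix is assumed from the directory layout used in this guide, so adjust it to wherever your images are relative to the darknet executable.

```python
from pathlib import Path


def write_image_list(image_dir, list_path, prefix="data/obj"):
    """Write one relative image path per line for every .jpg in image_dir."""
    # Sorting keeps the output stable between runs.
    image_names = sorted(p.name for p in Path(image_dir).glob("*.jpg"))
    lines = [f"{prefix}/{name}" for name in image_names]
    Path(list_path).write_text("\n".join(lines) + ("\n" if lines else ""))
    return lines


if __name__ == "__main__":
    # Assumed layout: images in data/obj, list written to data/train.txt.
    if Path("data/obj").is_dir():
        print(len(write_image_list("data/obj", "data/train.txt")), "images listed")
```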
Optimize anchors
Recalculate the anchors for your dataset, using the width and height from your cfg-file:
./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416
Then set the same 9 anchors in each of the 3 [yolo]-layers in your cfg-file. You should also change the anchor indexes in masks= for each [yolo]-layer, so that the 1st [yolo]-layer has the anchors larger than 60x60, the 2nd those larger than 30x30, and the 3rd the remaining ones. Also change filters=(classes + 5)*<number of mask> in the layer before each [yolo]-layer.
If many of the calculated anchors do not fit under the appropriate layers, just try using all the default anchors.
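The mask assignment described above can be sketched as follows. This is an illustration, not Darknet's own logic: the function name is hypothetical, and "larger than 60x60" is interpreted here as an anchor area above 60*60, which is an assumption. The example uses the default yolov3.cfg anchors.

```python
def assign_masks(anchors, classes, large=60 * 60, medium=30 * 30):
    """Group anchor indexes by area and compute filters for each [yolo]-layer.

    anchors is a list of (width, height) pairs in cfg order.
    Assumption: anchor size is compared by area (width * height).
    """
    masks = [[], [], []]
    for index, (width, height) in enumerate(anchors):
        area = width * height
        if area > large:
            masks[0].append(index)      # 1st [yolo]-layer: largest anchors
        elif area > medium:
            masks[1].append(index)      # 2nd [yolo]-layer: medium anchors
        else:
            masks[2].append(index)      # 3rd [yolo]-layer: remaining anchors
    # filters=(classes + 5)*<number of mask> for the layer before each [yolo]-layer
    return [(mask, (classes + 5) * len(mask)) for mask in masks]


if __name__ == "__main__":
    default_anchors = [(10, 13), (16, 30), (33, 23), (30, 61), (62, 45),
                       (59, 119), (116, 90), (156, 198), (373, 326)]
    for mask, filters in assign_masks(default_anchors, classes=2):
        print("mask =", ",".join(map(str, mask)), " filters =", filters)
```

If one layer ends up with no anchors or far more than the others, that is the situation in which falling back to the default anchors is suggested.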
Training
Clone the repository and build using cmake . && make.
Create the file yolo-obj.cfg with the same content as yolov3.cfg (or copy yolov3.cfg to yolo-obj.cfg).
Change the line batch to batch=64.
Change the line subdivisions to subdivisions=8.
Change the line max_batches to (classes*2000), e.g. max_batches=6000 if you train for 3 classes.
Change the line steps to 80% and 90% of max_batches, e.g. steps=4800,5400.
Change the line classes=80 to your number of object classes in each of the 3 [yolo]-layers.
Change filters=255 to filters=(classes + 5)x3 in the 3 [convolutional] layers before each [yolo]-layer (if classes=2 then write filters=21).
Set the flag random=1 in your .cfg-file; it increases precision by training Yolo at different resolutions.
Create the file obj.names in the directory build\darknet\x64\data\ with the object names, one per line.
Create the file obj.data in the directory build\darknet\x64\data\ containing the following:
classes= 2
train = data/train.txt
valid = data/test.txt
names = data/obj.names
backup = backup/
Put image-files (.jpg) of your objects in the directory
build\darknet\x64\data\obj\
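The class-dependent cfg values from the steps above (max_batches, steps and filters) follow directly from the number of classes. A small sketch of that arithmetic, with a hypothetical helper name and assuming 3 masks per [yolo]-layer as in the default yolov3.cfg:

```python
def cfg_values(classes, masks_per_layer=3):
    """Return the cfg lines that depend on the number of classes."""
    max_batches = classes * 2000
    # Integer arithmetic avoids floating-point rounding in the 80% / 90% steps.
    step_80 = max_batches * 8 // 10
    step_90 = max_batches * 9 // 10
    filters = (classes + 5) * masks_per_layer
    return [
        f"max_batches={max_batches}",
        f"steps={step_80},{step_90}",
        f"classes={classes}",
        f"filters={filters}",
    ]


if __name__ == "__main__":
    for line in cfg_values(3):
        print(line)
```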
Start training with the command:
./darknet detector train data/obj.data yolo-obj.cfg darknet53.conv.74
The file yolo-obj_last.weights will be saved to build\darknet\x64\backup\ every 100 iterations. After each 100 iterations you can stop and later resume training from that point. For example, after 2000 iterations you can stop training and later resume with:
./darknet detector train data/obj.data yolo-obj.cfg backup/yolo-obj_2000.weights
Note: If during training you see nan values in the avg (loss) field, training has gone wrong; nan in other lines is normal and training is going well.
Note: If you changed width= or height= in your cfg-file, the new width and height must be divisible by 32.
Note: If an Out of memory error occurs, increase subdivisions in the .cfg-file to 16, 32 or 64.
Stop Training
During training you will see varying indicators of error; you should stop when the average loss (0.XXXXXXX avg) no longer decreases:
Region Avg IOU: 0.798363, Class: 0.893232, Obj: 0.700808, No Obj: 0.004567, Avg Recall: 1.000000, count: 8
Region Avg IOU: 0.800677, Class: 0.892181, Obj: 0.701590, No Obj: 0.004574, Avg Recall: 1.000000, count: 8
9002: 0.211667, 0.060730 avg, 0.001000 rate, 3.868000 seconds, 576128 images
Loaded: 0.000000 seconds
9002 - iteration number (number of batch)
0.060730 avg - average loss (error) - the lower, the better
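To watch the average loss without reading the log by eye, the iteration line shown above can be parsed with a regular expression. This is a sketch that assumes the log keeps exactly the layout of the sample line; the function name is hypothetical.

```python
import math
import re

# Matches lines such as "9002: 0.211667, 0.060730 avg, 0.001000 rate, ..."
ITERATION_LINE = re.compile(r"^\s*(\d+):\s*([-\w.]+),\s*([-\w.]+)\s+avg")


def parse_iteration_line(line):
    """Return (iteration, loss, avg_loss), or None for any other log line."""
    match = ITERATION_LINE.match(line)
    if match is None:
        return None
    iteration = int(match.group(1))
    loss = float(match.group(2))       # float() also accepts "nan"
    avg_loss = float(match.group(3))
    return iteration, loss, avg_loss


if __name__ == "__main__":
    sample = "9002: 0.211667, 0.060730 avg, 0.001000 rate, 3.868000 seconds, 576128 images"
    iteration, loss, avg_loss = parse_iteration_line(sample)
    if math.isnan(avg_loss):
        print("avg loss is nan: training has gone wrong")
    else:
        print(iteration, avg_loss)
```

Collecting avg_loss over many iterations makes it easier to see when the value has stopped decreasing.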
Once training is stopped, take some of the last .weights-files from darknet\build\darknet\x64\backup and choose the best of them. For example, if you stopped training after 9000 iterations, the best result may come from one of the earlier weights (7000, 8000, 9000), because later weights can be overfitted.
First, in your file obj.data you must specify the path to the validation dataset, valid = valid.txt (valid.txt has the same format as train.txt). If you have no validation images, just copy data\train.txt to data\valid.txt. However, it is best to first divide the labeled data into train and valid sets.
If training is stopped after 9000 iterations, use these commands to validate some of the previous weights:
./darknet detector map data/obj.data yolo-obj.cfg backup/yolo-obj_9000.weights
./darknet detector map data/obj.data yolo-obj.cfg backup/yolo-obj_8000.weights
./darknet detector map data/obj.data yolo-obj.cfg backup/yolo-obj_7000.weights
Choose the weights-file with the highest mAP (mean average precision) or IoU (intersection over union).
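Dividing the labeled data into train and valid sets, as recommended above, can be sketched as a shuffled split of the image list. The 20% validation fraction, the fixed seed and the function name are assumptions for illustration, not values prescribed by Darknet.

```python
import random


def split_image_list(paths, valid_fraction=0.2, seed=0):
    """Shuffle the image paths and return (train_paths, valid_paths)."""
    shuffled = list(paths)
    # A seeded generator makes the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    valid_count = int(len(shuffled) * valid_fraction)
    return shuffled[valid_count:], shuffled[:valid_count]


if __name__ == "__main__":
    images = [f"data/obj/img_{i}.jpg" for i in range(10)]
    train_paths, valid_paths = split_image_list(images)
    print(len(train_paths), "train images,", len(valid_paths), "valid images")
```

Each returned list can then be written one path per line to the files named by train = and valid = in obj.data.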
After Training
Increase the network resolution in your .cfg-file (height=608, width=608, or any multiple of 32); this increases precision at the cost of a longer running time.
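Because width and height must be multiples of 32, a desired resolution can be snapped to the nearest valid value before editing the cfg-file. A minimal sketch with a hypothetical helper name:

```python
def nearest_valid_resolution(size, multiple=32):
    """Round size to the nearest multiple of 32, never going below 32."""
    # Adding half the multiple before integer division rounds to the nearest step.
    snapped = (size + multiple // 2) // multiple * multiple
    return max(multiple, snapped)


if __name__ == "__main__":
    for wanted in (416, 600, 800):
        print(wanted, "->", nearest_valid_resolution(wanted))
```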